Joy Arulraj - Research Statement

Author

  • Joy Arulraj
Abstract

Modern data-intensive applications apply statistical analysis algorithms on massive databases to deliver qualitatively better results in many domains, including science, governance, and business. These applications require low-latency, always-on, and cost-effective data management. This places increased demands on today’s database management systems (DBMSs), whose architectures are tailored for the hardware landscape of the s. The capabilities of nascent hardware technologies invalidate the long-held design assumptions these systems make about memory and storage. This shift argues for research into rearchitecting DBMSs for a new era of heterogeneous, disaggregated, and domain-specific hardware architectures. Given this outlook, the central theme of my research is the development of new database management systems that leverage the characteristics of emergent hardware technologies to meet the requirements of data-intensive applications. In particular, my current research focuses on a new class of non-volatile memory (NVM) technologies that blur the gap between volatile memory and durable storage. NVM supports low-latency, byte-addressable accesses similar to DRAM, but all writes are persistent, as on SSDs. Several aspects of NVM make existing DBMS architectures inappropriate for it. My research investigates how to rearchitect the DBMS from the ground up to take advantage of NVM []. I redesign the fundamental algorithms and data structures employed in traditional DBMSs to leverage the persistence and performance characteristics of NVM. This enables the DBMS to support low-latency transactions, instantaneous recovery from system failures, and cost-effective data management. My work shows that NVM’s impact spans all the layers of the DBMS, including logging and recovery [], storage management [, ], indexing [], and query execution [].
I plan to continue collaborating with partners in both academia and industry to identify shortcomings in existing systems for supporting data-intensive applications. I will take a principled approach to solving these problems by devising analytical models informed by both theory and practice. I will then validate these models by building systems and evaluating their efficacy through scientific experimentation. With the widespread adoption of systems for performing large-scale data science, my research will help move the state of the art forward in this field. I will next discuss the two primary areas of my current research agenda: non-volatile memory and self-driving DBMSs. These projects are a testament to my overall vision of redesigning DBMSs to take advantage of breakthroughs in other branches of computer science. I conclude with a discussion of the research directions that I intend to pursue in the future.

Similar resources

BzTree: A High-Performance Latch-free Range Index for Non-Volatile Memory

Storing a database (rows and indexes) entirely in non-volatile memory (NVM) potentially enables both high performance and fast recovery. To fully exploit parallelism on modern CPUs, modern main-memory databases use latch-free (lock-free) index structures, e.g. Bw-tree or skip lists. To achieve high performance, NVM-resident indexes also need to be latch-free. This paper describes the design of th...

Database Management Systems for Non-Volatile Memory

Changes in computer trends have given rise to new on-line transaction processing (OLTP) applications that support a large number of concurrent users and systems. What makes these modern applications unlike their predecessors is the scale at which they ingest information. Database management systems (DBMSs) are the critical component of these applications because they are responsible for ensurin...

SlimDB: A Space-Efficient Key-Value Storage Engine For Semi-Sorted Data

Modern key-value stores often use write-optimized indexes and compact in-memory indexes to speed up read and write performance. One popular write-optimized index is the Log-structured merge-tree (LSM-tree), which provides indexed access to write-intensive data. It has been increasingly used as a storage backbone for many services, including file system metadata management, graph processing engine...

An Empirical Evaluation of In-Memory Multi-Version Concurrency Control

Multi-version concurrency control (MVCC) is currently the most popular transaction management scheme in modern database management systems (DBMSs). Although MVCC was discovered in the late 1970s, it is used in almost every major relational DBMS released in the last decade. Maintaining multiple versions of data potentially increases parallelism without sacrificing serializability when processing...

Write-Behind Logging

The design of the logging and recovery components of database management systems (DBMSs) has always been influenced by the difference in the performance characteristics of volatile (DRAM) and non-volatile storage devices (HDD/SSDs). The key assumption has been that non-volatile storage is much slower than DRAM and only supports block-oriented read/writes. But the arrival of new nonvolatile memo...

Publication date: 2017